1.
Health Secur ; 22(2): 93-107, 2024.
Article in English | MEDLINE | ID: mdl-38608237

ABSTRACT

To better identify emerging or reemerging pathogens in patients with difficult-to-diagnose infections, it is important to improve access to advanced molecular testing methods. This is particularly relevant for cases where conventional microbiologic testing has been unable to detect the pathogen and the patient's specimens test negative. To assess the availability and utility of such testing for human clinical specimens, a literature review of published biomedical literature was conducted. From a corpus of more than 4,000 articles, a set of 34 reports was reviewed in detail for data on where the testing was being performed, types of clinical specimens tested, pathogen agnostic techniques and methods used, and results in terms of potential pathogens identified. This review assessed the frequency of advanced molecular testing, such as metagenomic next generation sequencing that has been applied to clinical specimens for supporting clinicians in caring for difficult-to-diagnose patients. Specimen types tested were from cerebrospinal fluid, respiratory secretions, and other body tissues and fluids. Publications included case reports and series, and there were several that involved clinical trials, surveillance studies, research programs, or outbreak situations. Testing identified both known human pathogens (sometimes in new sites) and previously unknown human pathogens. During this review, there were no apparent coordinated efforts identified to develop regional or national reports on emerging or reemerging pathogens. Therefore, development of a coordinated sentinel surveillance system that applies advanced molecular methods to clinical specimens which are negative by conventional microbiological diagnostic testing would provide a foundation for systematic characterization of emerging and underdiagnosed pathogens and contribute to national biodefense strategy goals.


Subjects
Molecular Diagnostic Techniques; Public Health; Humans; Disease Outbreaks/prevention & control; Metagenomics/methods; High-Throughput Nucleotide Sequencing
2.
Genome Biol ; 25(1): 92, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605401

ABSTRACT

BACKGROUND: In the metagenomic assembly of a microbial community, abundant species are often thought to assemble well given their deeper sequencing coverage. This conjecture is rarely tested or evaluated in practice. We often do not know how many abundant species are missing and do not have an approach to recover them. RESULTS: Here, we propose k-mer-based and 16S rRNA-based methods to measure the completeness of metagenome assembly. We show that even with PacBio high-fidelity (HiFi) reads, abundant species are often not assembled, as high strain diversity may lead to fragmented contigs. We develop a novel reference-free algorithm to recover abundant metagenome-assembled genomes (MAGs) by identifying circular assembly subgraphs. Complemented with a reference-free genome binning heuristic based on dimension reduction, the proposed method rescues many abundant species that would be missing with existing methods and produces results competitive with state-of-the-art binners in terms of the total number of near-complete genome bins. CONCLUSIONS: Our work emphasizes the importance of metagenome completeness, which has often been overlooked. Our algorithm generates more circular MAGs and moves a step closer to the complete representation of microbial communities.


Subjects
Metagenome; Microbiota; Microbiota/genetics; Algorithms; Bacteria/genetics; Metagenomics/methods
3.
Genome Biol Evol ; 16(4)2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38620144

ABSTRACT

In this perspective, we explore the transformative impact and inherent limitations of metagenomics and single-cell genomics on our understanding of microbial diversity and their integration into the Tree of Life. We delve into the key challenges associated with incorporating new microbial lineages into the Tree of Life through advanced phylogenomic approaches. Additionally, we shed light on enduring debates surrounding various aspects of the microbial Tree of Life, focusing on recent advances in some of its deepest nodes, such as the roots of bacteria, archaea, and eukaryotes. We also bring forth current limitations in genome recovery and phylogenomic methodology, as well as new avenues of research to uncover additional key microbial lineages and resolve the shape of the Tree of Life.


Subjects
Archaea; Bacteria; Archaea/genetics; Bacteria/genetics; Genomics; Metagenomics/methods; Phylogeny
4.
Genome Biol ; 25(1): 97, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622738

ABSTRACT

BACKGROUND: As most viruses remain uncultivated, metagenomics is currently the main method for virus discovery. Detecting viruses in metagenomic data is not trivial. In the past few years, many bioinformatic virus identification tools have been developed for this task, making it challenging to choose the right tools, parameters, and cutoffs. As all these tools measure different biological signals, and use different algorithms and training and reference databases, it is imperative to conduct an independent benchmarking to give users objective guidance. RESULTS: We compare the performance of nine state-of-the-art virus identification tools in thirteen modes on eight paired viral and microbial datasets from three distinct biomes, including a new complex dataset from Antarctic coastal waters. The tools have highly variable true positive rates (0-97%) and false positive rates (0-30%). PPR-Meta best distinguishes viral from microbial contigs, followed by DeepVirFinder, VirSorter2, and VIBRANT. Different tools identify different subsets of the benchmarking data and all tools, except for Sourmash, find unique viral contigs. Performance of tools improved with adjusted parameter cutoffs, indicating that adjustment of parameter cutoffs before usage should be considered. CONCLUSIONS: Together, our independent benchmarking facilitates the selection of bioinformatic virus identification tools and offers viromics researchers suggestions for parameter adjustments.
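
The true positive and false positive rates quoted above can be illustrated with a minimal Python sketch. It assumes each benchmark contig carries a known viral or microbial label and that a tool returns the set of contigs it calls viral. The function name `rates` and the toy contig identifiers are hypothetical and are not taken from the benchmarking study.

```python
def rates(predicted_viral, truly_viral, all_contigs):
    """Return (true positive rate, false positive rate) for one tool.

    predicted_viral: contig IDs the tool called viral
    truly_viral:     contig IDs known to be viral in the benchmark
    all_contigs:     every contig ID in the benchmark (viral + microbial)
    """
    predicted = set(predicted_viral)
    viral = set(truly_viral)
    microbial = set(all_contigs) - viral

    true_positives = len(predicted & viral)
    false_positives = len(predicted & microbial)

    # Guard against empty classes so the sketch never divides by zero.
    tpr = true_positives / len(viral) if viral else 0.0
    fpr = false_positives / len(microbial) if microbial else 0.0
    return tpr, fpr


# Toy benchmark: three viral and three microbial contigs (hypothetical IDs).
contigs = ["c1", "c2", "c3", "c4", "c5", "c6"]
viral = ["c1", "c2", "c3"]
calls = ["c1", "c2", "c4"]  # the tool finds two viral contigs and one microbial one

tpr, fpr = rates(calls, viral, contigs)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Re-running such a calculation at several score cutoffs is one simple way to see how adjusting a cutoff trades sensitivity against false positives.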


Subjects
Benchmarking; Viruses; Metagenome; Ecosystem; Metagenomics/methods; Computational Biology/methods; Databases, Genetic; Viruses/genetics
5.
BMC Bioinformatics ; 25(Suppl 1): 153, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38627615

ABSTRACT

BACKGROUND: With the rapid increase in throughput of long-read sequencing technologies, recent studies have explored their potential for taxonomic classification by using alignment-based approaches to reduce the impact of higher sequencing error rates. While alignment-based methods are generally slower, k-mer-based taxonomic classifiers can overcome this limitation, potentially at the expense of lower sensitivity for strains and species that are not in the database. RESULTS: We present MetageNN, a memory-efficient long-read taxonomic classifier that is robust to sequencing errors and missing genomes. MetageNN is a neural network model that uses short k-mer profiles of sequences to reduce the impact of distribution shifts on error-prone long reads. Benchmarking MetageNN against other machine learning approaches for taxonomic classification (GeNet) showed substantial improvements with long-read data (20% improvement in F1 score). By utilizing nanopore sequencing data, MetageNN exhibits improved sensitivity in situations where the reference database is incomplete. It surpasses the alignment-based MetaMaps and MEGAN-LR, as well as the k-mer-based Kraken2 tools, with improvements of 100%, 36%, and 23% respectively at the read-level analysis. Notably, at the community level, MetageNN consistently demonstrated higher sensitivities than the previously mentioned tools. Furthermore, MetageNN requires < 1/4th of the database storage used by Kraken2, MEGAN-LR and MMseqs2 and is > 7× faster than MetaMaps and GeNet and > 2× faster than MEGAN-LR and MMseqs2. CONCLUSION: This proof of concept work demonstrates the utility of machine-learning-based methods for taxonomic classification using long reads. MetageNN can be used on sequences not classified by conventional methods and offers an alternative approach for memory-efficient classifiers that can be optimized further.
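
To make the notion of a "short k-mer profile" concrete, here is a minimal Python sketch that turns a read into a normalized vector of k-mer frequencies. The abstract does not state the value of k, the normalization, or the network architecture, so the choices below (k = 3 by default, relative frequencies, skipping k-mers that contain non-ACGT characters) are assumptions made for illustration, and `kmer_profile` is a hypothetical name.

```python
from itertools import product


def kmer_profile(seq, k=3):
    """Return the relative frequency of every DNA k-mer in seq.

    The vector has 4**k entries ordered lexicographically (AAA, AAC, ...).
    Windows containing characters other than A, C, G, T are skipped.
    """
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = [0] * len(index)
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        position = index.get(seq[start:start + k])
        if position is not None:
            counts[position] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(index)
    return [count / total for count in counts]


profile = kmer_profile("ACGTACGT", k=2)
print(len(profile), round(sum(profile), 6))
```

Because a single substitution or indel changes at most k of the overlapping k-mers, short profiles of this kind shift only slightly under sequencing errors, which is consistent with the robustness argument made in the abstract.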


Subjects
Metagenomics; Viverridae; Animals; Metagenomics/methods; Neural Networks, Computer; Metagenome; Machine Learning; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods
6.
Microb Genom ; 10(4)2024 Apr.
Article in English | MEDLINE | ID: mdl-38630611

ABSTRACT

The ever-decreasing cost of sequencing and the growing potential applications of metagenomics have led to an unprecedented surge in data generation. One of the most prevalent applications of metagenomics is the study of microbial environments, such as the human gut. The gut microbiome plays a crucial role in human health, providing vital information for patient diagnosis and prognosis. However, analysing metagenomic data remains challenging due to several factors, including reference catalogues, sparsity and compositionality. Deep learning (DL) enables novel and promising approaches that complement state-of-the-art microbiome pipelines. DL-based methods can address almost all aspects of microbiome analysis, including novel pathogen detection, sequence classification, patient stratification and disease prediction. Beyond generating predictive models, a key aspect of these methods is also their interpretability. This article reviews DL approaches in metagenomics, including convolutional networks, autoencoders and attention-based models. These methods aggregate contextualized data and pave the way for improved patient care and a better understanding of the microbiome's key role in our health.


Subjects
Deep Learning; Gastrointestinal Microbiome; Microbiota; Humans; Metagenome; Metagenomics/methods
7.
Front Cell Infect Microbiol ; 14: 1329235, 2024.
Article in English | MEDLINE | ID: mdl-38638828

ABSTRACT

Metagenomic next-generation sequencing (mNGS) is a preferred genotyping method that is useful for identifying organisms, elucidating metabolic pathways, and characterizing microbiota. It can accurately obtain all the nucleic acid information in the test sample. Anthrax is one of the most important zoonotic diseases, infecting mainly herbivores and occasionally humans. The disease has four typical clinical forms (cutaneous, gastrointestinal, inhalation, and injection), all of which may result in sepsis or meningitis, with cutaneous being the most common form. Here, we report a case of cutaneous anthrax diagnosed by mNGS in a butcher. Histopathology of a skin biopsy revealed PAS-positive bacilli. mNGS of a formalin-fixed paraffin-embedded (FFPE) tissue sample confirmed the diagnosis of anthrax. He was cured with intravenous penicillin. To our knowledge, this is the first case of cutaneous anthrax diagnosed by mNGS using FFPE tissue. mNGS is useful for identifying pathogens that are difficult to diagnose with conventional methods, and FFPE samples are simple to manage. Compared with traditional bacterial culture, which is difficult to perform and takes a long time, mNGS can help diagnose anthrax quickly and accurately, so that the disease can be controlled in a timely manner and outbreaks can be prevented.


Subjects
Anthrax; Skin Diseases, Bacterial; Male; Humans; Anthrax/diagnosis; Paraffin Embedding; Formaldehyde/therapeutic use; High-Throughput Nucleotide Sequencing/methods; Metagenomics/methods; Sensitivity and Specificity
8.
J Med Virol ; 96(5): e29610, 2024 May.
Article in English | MEDLINE | ID: mdl-38654702

ABSTRACT

In 2022, a series of human monkeypox cases in multiple countries led to the largest and most widespread outbreak outside the known endemic areas. Setup of proper genomic surveillance is of utmost importance to control such outbreaks. To this end, we performed Nanopore (PromethION P24) and Illumina (NextSeq 2000) Whole Genome Sequencing (WGS) of a monkeypox sample. Adaptive sampling was applied for in silico depletion of the human host genome, allowing for the enrichment of low abundance viral DNA without a priori knowledge of sample composition. Nanopore sequencing allowed for high viral genome coverage, tracking of sample composition during sequencing, strain determination, and preliminary assessment of mutational pattern. In addition, only Nanopore data allowed us to resolve the entire monkeypox virus genome, with respect to two structural variants (SVs) belonging to the genes OPG015 and OPG208. These SVs in important host range genes seem stable throughout the outbreak and are frequently misassembled and/or misannotated due to the prevalence of short read sequencing or short read first assembly. Ideally, standalone standard Illumina sequencing should not be used for Monkeypox WGS and de novo assembly, since it will obfuscate the structure of the genome, which has an impact on the quality and completeness of the genomes deposited in public databases and thus possibly on the ability to evaluate the complete genetic reason for the host range change of monkeypox in the current pandemic.


Subjects
Genome, Viral; Metagenomics; Monkeypox virus; Monkeypox; Nanopore Sequencing; Whole Genome Sequencing; Humans; Genome, Viral/genetics; Metagenomics/methods; Nanopore Sequencing/methods; Monkeypox/epidemiology; Monkeypox/virology; Monkeypox virus/genetics; Monkeypox virus/isolation & purification; Whole Genome Sequencing/methods; Nanopores; DNA, Viral/genetics; High-Throughput Nucleotide Sequencing/methods
9.
BMC Cancer ; 24(1): 521, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38658858

ABSTRACT

BACKGROUND: Emerging evidence suggests that the gut microbiota is associated with various intracranial neoplastic diseases. Alterations in the gut microbiota have been observed in gliomas, meningiomas, and pituitary neuroendocrine tumors (Pit-NETs). However, the correlation between gut microbiota and craniopharyngioma (CP), a rare embryonic malformation tumor in the sellar region, has not been previously reported. Consequently, this study aimed to investigate the gut microbiota composition and metabolic patterns in CP patients, with the goal of identifying potential therapeutic approaches. METHODS: We enrolled 15 medication-free and non-operated patients with CP and 15 healthy controls (HCs), conducting sequential metagenomic and metabolomic analyses on fecal samples to investigate changes in the gut microbiota of CP patients. RESULTS: The composition of the gut microbiota in patients with CP differs significantly from that of HCs at both the genus and species levels. The CP group exhibits greater species diversity, and the metabolic patterns of the two groups vary markedly. CONCLUSIONS: The gut microbiota composition and metabolic patterns in patients with CP differ significantly from those of the healthy population, presenting potential new therapeutic opportunities.


Subjects
Craniopharyngioma; Feces; Gastrointestinal Microbiome; Pituitary Neoplasms; Humans; Craniopharyngioma/metabolism; Male; Female; Adult; Pituitary Neoplasms/metabolism; Pituitary Neoplasms/microbiology; Feces/microbiology; Middle Aged; Case-Control Studies; Young Adult; Adolescent; Metabolomics/methods; Metagenomics/methods; Metabolome
10.
Genome Med ; 16(1): 61, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38659008

ABSTRACT

BACKGROUND: Implementation of clinical metagenomics and pathogen genomic surveillance can be particularly challenging due to the lack of bioinformatics tools and/or expertise. In order to face this challenge, we have previously developed INSaFLU, a free web-based bioinformatics platform for virus next-generation sequencing data analysis. Here, we considerably expanded its genomic surveillance component and developed a new module (TELEVIR) for metagenomic virus identification. RESULTS: The routine genomic surveillance component was strengthened with new workflows and functionalities, including (i) a reference-based genome assembly pipeline for Oxford Nanopore technologies (ONT) data; (ii) automated SARS-CoV-2 lineage classification; (iii) Nextclade analysis; (iv) Nextstrain phylogeographic and temporal analysis (SARS-CoV-2, human and avian influenza, monkeypox, respiratory syncytial virus (RSV A/B), as well as a "generic" build for other viruses); and (v) algn2pheno for screening mutations of interest. Both INSaFLU pipelines for reference-based consensus generation (Illumina and ONT) were benchmarked against commonly used command line bioinformatics workflows for SARS-CoV-2, and an INSaFLU snakemake version was released. In parallel, a new module (TELEVIR) for virus detection was developed, after extensive benchmarking of state-of-the-art metagenomics software and following up-to-date recommendations and practices in the field. TELEVIR allows running complex workflows, covering several combinations of steps (e.g., with/without viral enrichment or host depletion), classification software (e.g., Kaiju, Kraken2, Centrifuge, FastViromeExplorer), and databases (RefSeq viral genome, Virosaurus, etc.), while culminating in user- and diagnosis-oriented reports. Finally, to potentiate real-time virus detection during ONT runs, we developed findONTime, a tool aimed at reducing costs and the time between sample reception and diagnosis. 
CONCLUSIONS: The accessibility, versatility, and functionality of INSaFLU-TELEVIR are expected to supply public and animal health laboratories and researchers with a user-oriented and pan-viral bioinformatics framework that promotes a strengthened and timely viral metagenomic detection and routine genomics surveillance. INSaFLU-TELEVIR is compatible with Illumina, Ion Torrent, and ONT data and is freely available at https://insaflu.insa.pt/ (online tool) and https://github.com/INSaFLU (code).


Subjects
COVID-19; Computational Biology; Genome, Viral; Metagenomics; SARS-CoV-2; Software; Metagenomics/methods; Humans; SARS-CoV-2/genetics; SARS-CoV-2/classification; COVID-19/virology; Computational Biology/methods; High-Throughput Nucleotide Sequencing/methods; Internet; Genomics/methods
11.
Nat Commun ; 15(1): 3421, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38653968

ABSTRACT

The emergence of bacterial species is rooted in their inherent potential for continuous evolution and adaptation to an ever-changing ecological landscape. The adaptive capacity of most species frequently resides within the repertoire of genes encoding the secreted proteome (SP), as it serves as a primary interface used to regulate survival/reproduction strategies. Here, by applying evolutionary genomics approaches to metagenomics data, we show that abundant freshwater bacteria exhibit biphasic adaptation states linked to the eco-evolutionary processes governing their genome sizes. While species with average to large genomes adhere to the dominant paradigm of evolution through niche adaptation by reducing the evolutionary pressure on their SPs (via the augmentation of functionally redundant genes that buffer mutational fitness loss) and increasing the phylogenetic distance of recombination events, most of the genome-reduced species exhibit a nonconforming state. In contrast, their SPs reflect a combination of low functional redundancy and high selection pressure, resulting in significantly higher levels of conservation and invariance. Our findings indicate that although niche adaptation is the principal mechanism driving speciation, freshwater genome-reduced bacteria often experience extended periods of adaptive stasis. Understanding the adaptive state of microbial species will lead to a better comprehension of their spatiotemporal dynamics, biogeography, and resilience to global change.


Subjects
Adaptation, Physiological; Bacteria; Fresh Water; Genome, Bacterial; Phylogeny; Bacteria/genetics; Bacteria/classification; Fresh Water/microbiology; Adaptation, Physiological/genetics; Metagenomics/methods; Evolution, Molecular; Genome Size; Proteome/genetics; Proteome/metabolism
12.
BMC Bioinformatics ; 25(1): 161, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38649836

ABSTRACT

BACKGROUND: Taxonomic classification of reads obtained by metagenomic sequencing is often a first step for understanding a microbial community, but correctly assigning sequencing reads to the strain or sub-species level has remained a challenging computational problem. RESULTS: We introduce Mora, a MetagenOmic read Re-Assignment algorithm capable of assigning short and long metagenomic reads with high precision, even at the strain level. Mora is able to accurately re-assign reads by first estimating abundances through an expectation-maximization algorithm and then utilizing abundance information to re-assign query reads. The key idea behind Mora is to maximize read re-assignment qualities while simultaneously minimizing the difference from estimated abundance levels, allowing Mora to avoid over-assigning reads to the same genomes. On simulated diverse reads, this allows Mora to achieve F1 scores comparable to other algorithms with a shorter runtime. Moreover, Mora significantly outperforms other algorithms on very similar reads. We show that the high penalty for over-assigning reads to a common reference genome allows Mora to accurately infer correct strains for real data in the form of E. coli reads. CONCLUSIONS: Mora is a fast and accurate read re-assignment algorithm that is modularized, allowing it to be incorporated into general metagenomics and genomics workflows. It is freely available at https://github.com/AfZheng126/MORA.
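
The two-stage idea described above (estimate abundances by expectation-maximization, then use them to re-assign reads) can be sketched in Python as follows. This is a generic textbook EM for mixture proportions, not Mora's implementation: the input format, the function names `em_abundances` and `reassign`, and the simple "highest abundance-weighted score wins" rule are assumptions. In particular, the sketch does not include the penalty on deviating from the estimated abundances that the abstract describes.

```python
def em_abundances(read_scores, n_genomes, iterations=50):
    """Estimate relative genome abundances from ambiguous read mappings.

    read_scores: one dict per read, {genome_index: mapping likelihood}.
    Returns a list of n_genomes proportions that sum to 1.
    """
    abundance = [1.0 / n_genomes] * n_genomes
    for _ in range(iterations):
        expected = [0.0] * n_genomes
        # E-step: split each read across its candidate genomes in
        # proportion to (current abundance) x (mapping likelihood).
        for read in read_scores:
            weights = {g: abundance[g] * s for g, s in read.items()}
            norm = sum(weights.values())
            if norm == 0:
                continue  # read has no usable candidate
            for g, w in weights.items():
                expected[g] += w / norm
        # M-step: new abundances are the normalized expected read counts.
        total = sum(expected)
        if total == 0:
            break
        abundance = [e / total for e in expected]
    return abundance


def reassign(read_scores, abundance):
    """Assign each read to its best genome, weighting scores by abundance."""
    return [
        max(read, key=lambda g: abundance[g] * read[g]) if read else None
        for read in read_scores
    ]


# Toy input: three reads unique to genome 0, one unique to genome 1,
# and one read that maps equally well to both.
reads = [{0: 1.0}, {0: 1.0}, {0: 1.0}, {1: 1.0}, {0: 0.5, 1: 0.5}]
abundance = em_abundances(reads, n_genomes=2)
assignment = reassign(reads, abundance)
print(abundance, assignment)
```

In this toy case the ambiguous read is resolved toward the more abundant genome. A naive rule like this can over-assign reads to a dominant genome, which is the failure mode the abundance-difference penalty described in the abstract is meant to avoid.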


Subjects
Algorithms; Metagenomics; Metagenomics/methods; Escherichia coli/genetics; Sequence Analysis, DNA/methods; Software; Metagenome/genetics; Genome, Bacterial
13.
PLoS One ; 19(4): e0301446, 2024.
Article in English | MEDLINE | ID: mdl-38573983

ABSTRACT

Reductions in sequencing costs have enabled widespread use of shotgun metagenomics and amplicon sequencing, which have drastically improved our understanding of the microbial world. However, large sequencing projects are now hampered by the cost of library preparation and low sample throughput, relative to the actual sequencing costs. Here, we benchmarked three high-throughput DNA extraction methods: ZymoBIOMICS™ 96 MagBead DNA Kit, MP Biomedicals™ FastDNA™-96 Soil Microbe DNA Kit, and DNeasy® 96 PowerSoil® Pro QIAcube® HT Kit. The DNA extractions were evaluated based on length, quality, quantity, and the observed microbial community across five diverse soil types. DNA extraction of all soil types was successful for all kits; however, the DNeasy® 96 PowerSoil® Pro QIAcube® HT Kit excelled across all performance parameters. We further used the nanoliter dispensing system I.DOT One to miniaturize Illumina amplicon and metagenomic library preparation volumes by a factor of 5 and 10, respectively, with no significant impact on the observed microbial communities. With these protocols, DNA extraction, metagenomic library preparation, and amplicon library preparation for one 96-well plate take approximately 3, 5, and 6 hours, respectively. Furthermore, the miniaturization of amplicon and metagenome library preparation reduces the chemical and plastic costs from 5.0 to 3.6 USD and from 59 to 7.3 USD per sample, respectively. This enhanced efficiency and cost-effectiveness will enable researchers to undertake studies with greater sample sizes and diversity, thereby providing a richer, more detailed view of microbial communities and their dynamics.
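
The reported per-sample costs can be turned into relative savings with a few lines of Python. The sketch assumes that the two cost pairs are listed in the same order as "amplicon and metagenome" in the abstract, and that a full plate holds 96 samples with no wells reserved for controls. The per-plate figures are derived here and are not reported by the authors.

```python
def percent_reduction(before, after):
    """Relative cost reduction in percent."""
    return 100.0 * (before - after) / before


# Chemical and plastic cost per sample in USD: (standard, miniaturized).
costs = {"amplicon": (5.0, 3.6), "metagenome": (59.0, 7.3)}
SAMPLES_PER_PLATE = 96  # assumption: every well holds a sample

reductions = {}
plate_savings = {}
for name, (before, after) in costs.items():
    reductions[name] = percent_reduction(before, after)
    plate_savings[name] = SAMPLES_PER_PLATE * (before - after)
    print(f"{name}: {reductions[name]:.1f}% cheaper per sample, "
          f"{plate_savings[name]:.1f} USD saved per plate")
```

Under these assumptions most of the absolute saving comes from metagenomic library preparation, where the tenfold volume reduction applies to the more expensive reagents.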


Subjects
High-Throughput Nucleotide Sequencing; Metagenome; Cost-Benefit Analysis; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; DNA; Soil; Metagenomics/methods
14.
BMC Bioinformatics ; 25(1): 137, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38553666

ABSTRACT

BACKGROUND: Metagenomic sequencing technologies offered unprecedented opportunities and also challenges to microbiology and microbial ecology particularly. The technology has revolutionized the studies of microbes and enabled the high-profile human microbiome and earth microbiome projects. The terminology-change from microbes to microbiomes signals that our capability to count and classify microbes (microbiomes) has achieved the same or similar level as we can for the biomes (macrobiomes) of plants and animals (macrobes). While the traditional investigations of macrobiomes have usually been conducted through naturalists' (Linnaeus & Darwin) naked eyes, and aerial and satellite images (remote-sensing), the large-scale investigations of microbiomes have been made possible by DNA-sequencing-based metagenomic technologies. Two major types of metagenomic sequencing technologies-amplicon sequencing and whole-genome (shotgun sequencing)-respectively generate two contrastingly different categories of metagenomic reads (data)-OTU (operational taxonomic unit) tables representing microorganisms and OMU (operational metagenomic unit), a new term coined in this article to represent various cluster units of metagenomic genes. RESULTS: The ecological science of microbiomes based on the OTU representing microbes has been unified with the classic ecology of macrobes (macrobiomes), but the unification based on OMU representing metagenomes has been rather limited. In a previous series of studies, we have demonstrated the applications of several classic ecological theories (diversity, composition, heterogeneity, and biogeography) to the studies of metagenomes. Here I push the envelope for the unification of OTU and OMU again by demonstrating the applications of metacommunity assembly and ecological networks to the metagenomes of human gut microbiomes. 
Specifically, the neutral theory of biodiversity (Sloan's near-neutral model), the Ning et al. stochasticity framework, core-periphery network, high-salience skeleton network, special trio-motif, and positive-to-negative ratio are applied to analyze the OMU tables from whole-genome sequencing technologies, and demonstrated with seven human gut metagenome datasets from the human microbiome project. CONCLUSIONS: All of the ecological theories demonstrated previously and in this article, including diversity, composition, heterogeneity, stochasticity, and complex network analyses, are equally applicable to OMU metagenomic analyses, just as to OTU analyses. Consequently, I strongly advocate the unification of OTU/OMU (microbiomes) with classic ecology of plants and animals (macrobiomes) in the context of medical ecology.


Subjects
Gastrointestinal Microbiome; Microbiota; Animals; Humans; Metagenome; Microbiota/genetics; Biodiversity; Sequence Analysis, DNA; Metagenomics/methods
15.
Genome Res ; 34(2): 326-340, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38428994

ABSTRACT

Pacific Biosciences (PacBio) HiFi sequencing technology generates long reads (>10 kbp) with very high accuracy (<0.01% sequencing error). Although several de novo assembly tools are available for HiFi reads, there are no comprehensive studies on the evaluation of these assemblers. We evaluated the performance of 11 de novo HiFi assemblers on (1) real data for three eukaryotic genomes; (2) 34 synthetic data sets with different ploidy, sequencing coverage levels, heterozygosity rates, and sequencing error rates; (3) one real metagenomic data set; and (4) five synthetic metagenomic data sets with different composition abundance and heterozygosity rates. The 11 assemblers were evaluated using quality assessment tool (QUAST) and benchmarking universal single-copy ortholog (BUSCO). We also used several additional criteria, namely, completion rate, single-copy completion rate, duplicated completion rate, average proportion of largest category, average distance difference, quality value, run-time, and memory utilization. Results show that hifiasm and hifiasm-meta should be the first choice for assembling eukaryotic genomes and metagenomes with HiFi data. We performed a comprehensive benchmarking study of commonly used assemblers on complex eukaryotic genomes and metagenomes. Our study will help the research community to choose the most appropriate assembler for their data and identify possible improvements in assembly algorithms.


Subjects
Metagenome; Software; Sequence Analysis, DNA/methods; Algorithms; Metagenomics/methods; High-Throughput Nucleotide Sequencing/methods
16.
J Hazard Mater ; 469: 133939, 2024 May 05.
Article in English | MEDLINE | ID: mdl-38490149

ABSTRACT

Wastewater surveillance is a powerful tool to assess the risks associated with antibiotic resistance in communities. One challenge is selecting which analytical tool to deploy to measure risk indicators, such as antibiotic resistance genes (ARGs) and their respective bacterial hosts. Although metagenomics is frequently used for analyzing ARGs, few studies have compared the performance of long-read and short-read metagenomics in identifying which bacteria harbor ARGs in wastewater. Furthermore, for ARG host detection, untargeted metagenomics has not been compared to targeted methods such as epicPCR. Here, we 1) evaluated long-read and short-read metagenomics as well as epicPCR for detecting ARG hosts in wastewater, and 2) investigated the host range of ARGs across the wastewater treatment plant (WWTP) to evaluate host proliferation. Results showed that long-read metagenomics revealed a wider range of ARG hosts than short-read metagenomics. Nonetheless, the ARG host range detected by long-read metagenomics represented only a subset of the hosts detected by epicPCR. The ARG-host linkages across the influent and effluent of the WWTP were characterized. Results showed that the ARG-host phylum linkages were relatively consistent across the WWTP, whereas new ARG-host species linkages appeared in the WWTP effluent. The ARG-host linkages of several clinically relevant species found in the effluent were identified.


Subjects
Anti-Bacterial Agents; Wastewater; Anti-Bacterial Agents/pharmacology; Genes, Bacterial; Wastewater-Based Epidemiological Monitoring; Bacteria/genetics; Drug Resistance, Bacterial/genetics; Metagenomics/methods
17.
Bioinformatics ; 40(4)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38492564

ABSTRACT

MOTIVATION: Taxonomic classification of short reads and taxonomic profiling of metagenomic samples are well-studied yet challenging problems. The presence of species belonging to groups without close representation in a reference dataset is particularly challenging. While k-mer-based methods have performed well in terms of running time and accuracy, they tend to have reduced accuracy for such novel species. Thus, there is a growing need for methods that combine the scalability of k-mers with increased sensitivity. RESULTS: Here, we show that using locality-sensitive hashing (LSH) can increase the sensitivity of the k-mer-based search. Our method, which combines LSH with several heuristic techniques including soft lowest common ancestor labeling and voting, is more accurate than alternatives in both taxonomic classification of individual reads and abundance profiling. AVAILABILITY AND IMPLEMENTATION: CONSULT-II is implemented in C++, and the software, together with reference libraries, is publicly available on GitHub at https://github.com/bo1929/CONSULT-II.
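
As an illustration of how locality-sensitive hashing can make a k-mer search tolerant of mismatches, the Python sketch below uses bit-sampling LSH for Hamming distance: each hash table indexes k-mers by the characters at a random subset of positions, so two k-mers that differ in a few positions still collide in any table whose sampled positions miss those differences. This is a generic textbook scheme, not CONSULT-II's implementation. The function names, the number of tables, the number of sampled positions, and the distance threshold are all assumptions, and the soft-LCA labeling and voting steps are not shown.

```python
import random
from collections import defaultdict


def build_tables(kmers, k, n_tables=4, n_positions=6, seed=0):
    """Index reference k-mers in several hash tables.

    Each table uses its own random subset of positions (a "mask") and
    stores every k-mer under the characters found at those positions.
    Requires n_positions <= k.
    """
    rng = random.Random(seed)
    masks = [sorted(rng.sample(range(k), n_positions)) for _ in range(n_tables)]
    tables = [defaultdict(set) for _ in masks]
    for kmer in kmers:
        for mask, table in zip(masks, tables):
            table["".join(kmer[i] for i in mask)].add(kmer)
    return masks, tables


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def query(kmer, masks, tables, max_dist=2):
    """Return reference k-mers within max_dist mismatches of kmer.

    Candidates come from hash collisions; each one is then verified with
    an exact Hamming-distance check, so there are no false positives,
    although some true neighbors can be missed.
    """
    candidates = set()
    for mask, table in zip(masks, tables):
        candidates |= table.get("".join(kmer[i] for i in mask), set())
    return {c for c in candidates if hamming(c, kmer) <= max_dist}


reference = ["ACGTACGTACGT", "TTTTGGGGCCCC"]
masks, tables = build_tables(reference, k=12)
exact_hits = query("ACGTACGTACGT", masks, tables)
distant_hits = query("AAAAAAAAAAAA", masks, tables)
print(exact_hits, distant_hits)
```

Using more tables raises the chance that a near match collides in at least one of them, at the price of memory, which is the usual sensitivity versus scalability trade-off in schemes of this kind.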


Subjects
Algorithms; Software; Sequence Analysis, DNA/methods; Metagenomics/methods; Metagenome
18.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38487846

ABSTRACT

Beneficial bacteria remain largely unexplored. Without systematic methods, the traits of probiotic communities are difficult to characterize, and publications reach differing conclusions about probiotic effects. We developed metaProbiotics, a language model-based tool that rapidly detects probiotic bins from metagenomes, and it demonstrated superior performance on simulated benchmark datasets. When tested on gut metagenomes from probiotic-treated individuals, it revealed the probioticity of bins derived from the intervention strains and of other probiotic-associated bins beyond the training data, such as a plasmid-like bin. Analyses of these bins revealed various probiotic mechanisms and identified the bai operon as a potential marker of probiotic Ruminococcaceae. In different health-disease cohorts, these bins were more common in healthy individuals, signifying their probiotic role, but health predictions based on the abundance profiles of these bins faced cross-disease challenges. To better understand the heterogeneous nature of probiotics, we used metaProbiotics to construct a comprehensive probiotic genome set from global gut metagenomic data. Module analysis of this set shows that diseased individuals often lack certain probiotic gene modules, with significant variation in the missing modules across different diseases. Additionally, different gene modules on the same probiotic have heterogeneous effects on various diseases. We thus believe that the gene function integrity of the probiotic community is more crucial in maintaining gut homeostasis than merely increasing the abundance of specific genes, and that adding probiotics indiscriminately might not boost health. We expect that the language model-based metaProbiotics tool will promote novel probiotic discovery using large-scale metagenomic data and facilitate systematic research on bacterial probiotic effects. The metaProbiotics program can be freely downloaded at https://github.com/zhenchengfang/metaProbiotics.


Subjects
Metagenome; Probiotics; Humans; Algorithms; Metagenomics/methods; Bacteria/genetics; Language
19.
Microbiol Spectr ; 12(4): e0359023, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38451230

ABSTRACT

Shotgun metagenomics enables the reconstruction of complex microbial communities at a high level of detail. Such an approach can be conducted using short-read sequencing data, long-read sequencing data, or a combination of both. To assess the pros and cons of these different approaches, we used 22 fecal DNA extracts collected weekly for 11 weeks from two lab mice to study seven performance metrics over four combinations of sequencing depth and technology: (i) 20 Gbp of Illumina short-read data, (ii) 40 Gbp of short-read data, (iii) 20 Gbp of PacBio HiFi long-read data, and (iv) 40 Gbp of hybrid (20 Gbp of short-read + 20 Gbp of long-read) data. No strategy was best for all metrics; instead, each one excelled on different metrics. The long-read approach yielded the best assembly statistics, with the highest N50 and lowest number of contigs. The 40 Gbp short-read approach yielded the highest number of refined bins. Finally, the hybrid approach yielded the longest assemblies and the highest mapping rate to the bacterial genomes. Our results suggest that while long-read sequencing significantly improves the quality of reconstructed bacterial genomes, it is more expensive and requires deeper sequencing than short-read approaches to recover a comparable number of reconstructed genomes. The optimal strategy is study-specific and depends on how researchers assess the trade-off between the quantity and quality of recovered genomes. IMPORTANCE: Mice are an important model organism for understanding the gut microbiome. When studying these gut microbiomes using DNA techniques, researchers can choose from technologies that use short or long DNA reads. In this study, we perform an extensive benchmark between short- and long-read DNA sequencing for studying mouse gut microbiomes. We find that no one approach was best for all metrics and provide information that can help guide researchers in planning their experiments.
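
For readers unfamiliar with the N50 statistic mentioned above, here is a minimal Python sketch of the common definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name is illustrative and is not taken from the study's own tooling.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the
    length of the contig at which the cumulative length first
    reaches at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
```

A higher N50 together with a lower contig count indicates a less fragmented assembly, which is why the two are reported side by side.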


Subjects
Genome, Bacterial; Microbiota; Animals; Mice; Sequence Analysis, DNA/methods; Microbiota/genetics; Metagenomics/methods; DNA; High-Throughput Nucleotide Sequencing/methods
20.
Microbiome ; 12(1): 58, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38504332

ABSTRACT

BACKGROUND: Microbiota are closely associated with human health and disease. Metaproteomics can provide a direct means to identify microbial proteins in microbiota for compositional and functional characterization. However, in-depth and accurate metaproteomics is still limited by the extreme complexity and high diversity of microbiota samples. It is generally recommended to use metagenomic data from the same samples to construct the protein sequence database for metaproteomic data analysis. Although different metagenomics-based database construction strategies have been developed, optimization of gene taxonomic annotation, which is extremely important for accurate metaproteomic analysis, has not been reported. RESULTS: Herein, we propose an accurate taxonomic annotation pipeline for genes from metagenomic data, namely contigs directed gene annotation (ConDiGA), and use the method to build a protein sequence database for metaproteomic analysis. We compared our pipeline (ConDiGA or MD3) with two other popular annotation pipelines (MD1 and MD2). In MD1, genes were directly annotated against the whole bacterial genome database; in MD2, contigs were annotated against the whole bacterial genome database and the taxonomic information of contigs was assigned to the genes; in MD3, the most confident species from the contig annotation results were taken as the reference to annotate genes. Annotation tools, including BLAST, Kaiju, and Kraken2, were compared. Based on a synthetic microbial community of 12 species, it was found that Kaiju with the MD3 pipeline outperformed the others in the construction of a protein sequence database from metagenomic data. Similar performance was also observed with a fecal sample, as well as with in silico mixed datasets of the simulated microbial community and the fecal sample. CONCLUSIONS: Overall, we developed an optimized pipeline for gene taxonomic annotation to construct protein sequence databases. Our study can tackle the current taxonomic annotation reliability problem in metagenomics-derived protein sequence databases and can promote in-depth metaproteomic analysis of the microbiome. The unique metagenomic and metaproteomic datasets of the 12 bacterial species are publicly available as a standard benchmarking sample for evaluating various analysis pipelines. The code of ConDiGA is open access on GitHub for the analysis of microbiota samples.
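
The difference between the MD2 and MD3 strategies can be sketched in Python as follows. This is only an illustration under stated assumptions, not the ConDiGA code: the function names are hypothetical, and the rule for "most confident species" (here, a species supported by at least `min_contigs` contigs) is an assumed stand-in, since the abstract does not define it. The re-annotation of genes against the shortlisted species with BLAST, Kaiju, or Kraken2 is not shown.

```python
from collections import Counter


def genes_inherit_contig_label(gene_to_contig, contig_species):
    """MD2-style step: each gene takes the species of its parent contig."""
    return {
        gene: contig_species.get(contig, "unclassified")
        for gene, contig in gene_to_contig.items()
    }


def confident_species(contig_species, min_contigs=2):
    """First half of an MD3-style step: shortlist species for a reduced reference.

    ASSUMPTION: a species counts as confident when at least
    `min_contigs` contigs were assigned to it. The published
    pipeline may use a different criterion.
    """
    counts = Counter(contig_species.values())
    return {species for species, n in counts.items() if n >= min_contigs}
```

In an MD3-style workflow the shortlist returned by `confident_species` would define a smaller reference database, and genes would then be annotated against that database rather than against all bacterial genomes.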


Subjects
Microbiota; Humans; Databases, Protein; Molecular Sequence Annotation; Reproducibility of Results; Microbiota/genetics; Metagenome/genetics; Bacteria/genetics; Metagenomics/methods